Introduction

In 2006 Elon Musk wrote a blog post titled, The Secret Tesla Motors Master Plan (just between you and me), where he detailed out the strategy for the company. In this post, he offered up a multi-step plan to achieve the mission slated by the company:

Tesla’s mission is to accelerate the world’s transition to sustainable energy.

The plan detailed four steps that would eventually lead to what is known as the Tesla Model 3, the first affordable, high-performance, no-compromise electric car. As of March 2020, the model 3 became the all-time best selling plug-in electric car surpassing the Nissan LEAF, and it accomplished this in just 2.5 years, versus ten years for the LEAF1.

As the plan noted, the goal was to build a more affordable car accessible to more people than the previous premium market products. The Model 3 was launched with a $35,000 price point making it competitive with entry-level German vehicles. It was met with an incredible reception garnering 200,000 pre-orders in the first 24 hours after it’s launch2. It’s since sold 500,000 units and continues to be loved.

However, while the company is doing fantastic, its stock at all-time highs and soon entering the S&P 5003 has seen its share of Quality issues. In June of 2020, J.D. Power released its annual quality study showing that Tesla was ranked last among 32 automotive brands4. Bloomberg performed a survey of 5,000 Model 3 owners published in October of 2019, where owners submitted details of their quality issues 5. Owners stated that the most significant problems were with paint and panel gaps. While the report found that cars’ defects cut in half over time, Tesla is still working to optimize its production.

This report will look at user’s discussions, mostly in 2020, from the Tesla Model 3 Discussion Forums to surface what is top of mind and what issues might still be effecting the Model 3. The forums are an open place where people can post topics, ask questions, or generally participate in the community. User forums are rich with information that can give an alternative view into customer sentiment, unlike Social Media or traditional Surveys.

Note: This report will not look to surface the root cause of issues, positive or negative, but rather surface topics that are most top-of-mind for customers and where the Tesla team could investigate further.

The plan’s four-step master plan:

  1. Build sports car
  2. Use that money to build an affordable car
  3. Use that money to build an even more affordable car
  4. While doing above, also provide zero emission electric power generation options

Tesla Model 3 Performance

# plotting and pipes
library(tidyverse)
library(stringr)
library(tidyr)

# text mining library
library(tm)
library(tidytext)
library(wordcloud)
library(reshape2)
library(textstem)
library(ggraph)
library(igraph)
library(widyr)
library(spacyr)
library(SnowballC)
library(topicmodels)
library(quanteda)
library(seededlda)
library(parallel)
library(ldatuning)

# date/time library
library(lubridate)

# Read in the tesla forum data
df <- read.csv('tesla_forums.csv')

# Adjust variable types
df$Time <- as_datetime(df$Time)
df$User <- as.factor(df$User)
df$Topic <- as.factor(df$Topic)
# Drop a small amount of rows with NA values
df <- drop_na(df)
# Removed all duplicates.  The scraping method used created quite a few.
df <- distinct(df)
# Remove the first topic, it's just the "how to use the forums" thread and doesn't aid in analysis
df <- df[-c(1:24), ]
# Add Doc_Id incrementing per Row
df <- df %>%
  mutate(doc_id = paste0("doc", row_number())) %>%
  select(doc_id, everything())
# Add a Column for Text Length
df$text_len <- str_count(df$Discussion)

Exploratory Data Analysis

Perform an Exploratory Data Analysis (EDA) to better understand the characteristics, extents, and shape of our data.

Dataset

The dataset was obtained from the Tesla Model 3 Discussion Forums on December 15, 2020. The data was extracted utilizing a method known as Web Scraping, which pulls a web page into memory and extracts HTML information from it. This is no different from web crawling or how a browser caches a page as you visit it. A GitHub Repo is maintained with the source code and extracted datasets.

Summary

df %>%
  select(Discussion, Time, text_len) %>%
  summary()
##   Discussion             Time                        text_len     
##  Length:54311       Min.   :2015-12-10 19:16:17   Min.   :   1.0  
##  Class :character   1st Qu.:2019-10-04 08:41:09   1st Qu.:  86.0  
##  Mode  :character   Median :2020-04-03 16:40:26   Median : 179.0  
##                     Mean   :2020-01-26 01:41:25   Mean   : 278.4  
##                     3rd Qu.:2020-07-29 17:58:40   3rd Qu.: 348.0  
##                     Max.   :2020-12-15 02:19:29   Max.   :7944.0
# Make a copy of the original DF so it can be referenced later.
df_select <- df
  • Discussions: There are a total of 54,311 discussion threads in this dataset after removing duplicates. This is essentially like a comment on a Facebook post. A Topic (not shown) is posted, and Discussions happen on those topics.
  • Time: Dates range from 2015-12-10 to 2020-12-15. The Median, Mean and 3rd Quartile are all in 2020 telling us that most of the dates in this set are in 2020.
  • Text_Len: Min length of text is 0 and max is 7,944 characters with a median of 179.0.

Topic & User Information

This data is stored because, for each discussion row, the topic title is repeated. Therefore we need to summarize the rows and aggregate them into counts for each unique topic. This way, we can also see how many discussions are

df_topics <- df_select %>%
  group_by(Topic) %>%
  summarise(count = n(), .groups="keep") %>%
  arrange(desc(count))
head(df_topics)
df_topics %>%
  ggplot(aes(count)) + 
  geom_histogram(fill="#cc0000", color="#ffffff", bins=30) +
  theme_minimal() +
  scale_y_log10() +
  labs(x = "Number of Discussions per Topic",
       y = "Count (Log10 Scale)",
      title = "Distributions of Discussions per Topic",
      subtitle = "Number of replies per unqiue thread"
      ) +
  theme(plot.title = element_text(face = "bold"))

Regarding the number of Discussions per Topic, a heavily right-skewed distribution with a range of 500-1,000 total topics with 0-25 discussions each. After 25 or so (x-axis), there are just a few with greater than 25 replies per topic. There are two topics above 75, as noted in the table above.

Total Topics

sprintf("There are %s unique topics", nrow(df_topics))
## [1] "There are 3676 unique topics"
df_users <- df_select %>% 
  group_by(User) %>%
  summarise(count = n(), .groups="keep") %>% 
  arrange(desc(count))
head(df_users, n=10)

The forums are quite active by various users. 8 users have over 1,000 posts in this dataset.

df_users %>%
  ggplot(aes(count)) + 
  geom_histogram(fill="#cc0000", color="#ffffff", bins=30) +
  theme_minimal() +
  scale_y_log10() +
  labs(x = "Number Posts",
       y = "Count (Log10 Scale)",
      title = "Distributions of Active Users",
      subtitle = "Number of unique entried per user name"
      ) +
  theme(plot.title = element_text(face = "bold"))

A large number of users have a very small number of posts, 1,500+. There are a small number that are extremely active on the forums having > 500 posts.

Total Users

sprintf("There are %s total unique users", nrow(df_users))
## [1] "There are 6548 total unique users"

Text Length Analysis

summary(df_select$text_len)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     1.0    86.0   179.0   278.4   348.0  7944.0

Discussion lengths for the dataset range from 0 characters to 7,944 with a median of 179 with a mean of 278.

df_select %>%
  ggplot(aes(text_len)) + 
  geom_histogram(fill="#cc0000", color="#ffffff", bins=30) +
  theme_minimal() +
  scale_y_log10() +
  labs(x = "Text Length",
       y = "Count (Log10 Scale)",
      title = "Distributions of Text Length",
      subtitle = "Per character counts of the replies to topics"
      ) +
  theme(plot.title = element_text(face = "bold"))

Text length for posts is right-skewed as well, with most posts being shorter in length. But there is a much more spread distribution towards the right tail.

Discussion Frequency

df %>%
  mutate(date = floor_date(Time, "week")) %>%
  group_by(date) %>%
  summarize(count = n(), .groups = 'keep') %>%
  
  
  ggplot(aes(date, count)) +
  geom_line(show.legend = FALSE, color="#cc0000") +
  theme_minimal() +
  labs(
    x = NULL,
    y = "Frequency",
    title = "Number of Discussion Posts per Week",
    subtitle = "Total count of comments/replies per week"
  ) +
  theme(plot.title = element_text(face = "bold"))

When viewing the posts’ time-frequency, the data goes back to 2016, but activity jumps at the start of 2020. Due to the way these were scraped from the Forums, starting with the newest posts and working backward, it should be the case that we are loaded more in the current year. There is a dip in activity around mid-2020; this most likely is an error with scraping vs. lack of activity on the forum. For the sake of this analysis is focused mostly on text; it’s not critical to understanding.

Outlier Analysis

The following post is the longest one in our dataset. We can examine it to see if it is valid or not regarding our analysis. If it is not, we will remove it, otherwise it will remain in the analysis.

df_select %>%
  filter(text_len > 5000) %>%
  select(Discussion) %>%
  head(n=1)

Note: Since this is discussion forum text, outliers are only long posts as demonstrated above. They will remain in the dataset since longer text often contains valuable information.

Text Cleaning

To better machine-analyze the text extracted from the forum, standard text cleaning is performed to normalize the text. Additionally, the text is lemmatized, transforming words to their lemma, or base word. We will not remove numbers in this operation since we focus on the Model 3, containing a number in its name.

df_select$Discussion <- iconv(df_select$Discussion, "latin1", "ASCII", sub = "")
df_select$Discussion <- str_replace_all(df_select$Discussion,"\\n"," ")
df_select$Discussion <- str_replace_all(df_select$Discussion,"@","")
df_select$Discussion <- str_replace_all(df_select$Discussion,"="," ")
df_select$Discussion <- str_replace_all(df_select$Discussion,"-"," ")
df_select$Discussion <- gsub("http[[:alnum:][:punct:]]*", "", df_select$Discussion)
df_select$Discussion = removePunctuation(df_select$Discussion)
df_select$Discussion = stripWhitespace(df_select$Discussion)
df_select$Discussion = tolower(df_select$Discussion)
df_select$Discussion = removeWords(df_select$Discussion, c(stopwords('english')))
df_select$Discussion = lemmatize_strings(df_select$Discussion)
tidy_df <- df_select %>%
  unnest_tokens(word, Discussion)

Sentiment Analysis

Sentiment analysis is the process of systematically identifying the emotion of different words in a text corpus. There are several methods available, from text-based lexicon lookups to more advanced machine learning-based models that consider sentence structure. For this exercise, we’ll examine the text through various lexicon-based methods.

Bing Sentiment Lexicon

Using the Bing Lexicon from Bing Liu and collaborators, adds the column “Sentiment” and mark each word as positive or negative.

https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html

bing_df <- tidy_df %>%
  inner_join(get_sentiments("bing"), by = "word")

bing_df %>%
  group_by(sentiment) %>%
  summarise(count = n(), .groups = "keep")

Based on a pure lookup, the text in the forum is overall positive. with ~83k positive values and ~69k negative values.

AFINN scoring Lexicon

AFINN from Finn Årup Nielsen, adds the value column, with a numeric representation of how positive, or negative the word is. The AFINN lexicon measures sentiment with a numeric score between -5 and 5

http://www2.imm.dtu.dk/pubdb/pubs/6010-full.html

afinn_df <- tidy_df %>%
  inner_join(get_sentiments("afinn"), by = "word")

afinn_df %>% 
  select(word, value) %>% 
  head(n=10)

After applying the AFINN lexicon, you can see the different values applied to each word, with varying polarity levels. In the above example, words like, fit, yes, and agree all have a +1 value where good has a +3 value.

afinn_df %>%
  ggplot(aes(x = value)) +
  geom_histogram(bins = 10, show.legend = FALSE, fill="#cc0000", color="#ffffff") +
  scale_x_continuous(breaks = c(-5, -3, -1, 1, 3, 5)) +
  theme_minimal() +
  scale_colour_grey(start = 0.3, end = .8) +
  labs(
    x = NULL,
    y = NULL,
    title = "Distribution of AFINN Sentiment Scores by Value",
    subtitle = "Count of occurences of each score value"
  ) +
  theme(plot.title = element_text(face = "bold"))

For the dataset overall, there is a slight left-skew showing there is a greater concentration of words with positive values. There are very few high and low values (-5, +5).

Note: 0 is not a valid value in this system. Therefore the bin is empty

Sentiment Visualizations

Now that we have sentiments applied to our tokenized words, we can do some simple analysis on the overall sentiment, the top words, and sentiment over time.

Top Word Counts (BING)

bing_df %>%
  count(word, sort = TRUE, sentiment) %>%

  group_by(sentiment) %>%
  top_n(15) %>%
  ungroup() %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(n, word, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free") +
  theme_minimal() +
  scale_fill_manual(values=c("#cc0000", "#cccccc")) +
  labs(x = "Contribution to sentiment",
       y = NULL,
    title = "Top 15 Positive and Negative Words",
    subtitle = "Using BING Sentiment Lexicon"
  ) +
  theme(plot.title = element_text(face = "bold"))

Focusing on the Negative words, the top occurrence is issue, and the second is problem. Given this forum’s nature, talking about a product, these are very practical words to be on the top of the list. People are reporting or discussing issues and problems with their cars. Bug, Noise, Break, and Damage all feel like perfect matches as well.

The word Numb is present because numb is the lemma of number. We demonstrated here by calling the lemmatize_words function. This mis-classification is an excellent example of one of the pitfalls with lexicon-based sentiment analysis. After stemming or lemmatizing the word, it can alter its meaning. Usually, “number” would be a neutral word (e.g., not in the BING Lexicon).

sprintf("The lemma of Number is %s", lemmatize_words("number"))
## [1] "The lemma of Number is numb"

Overall Top Words (BING)

bing_df %>%
  count(word, sort = TRUE, sentiment) %>%
  top_n(30) %>%
  ungroup() %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(n, word, fill = sentiment)) +
  geom_col(show.legend = TRUE) +
  scale_fill_manual(values=c("#cc0000", "#cccccc")) +
  theme_minimal() +
  labs(x = "Contribution to sentiment", y = NULL,
    title = "Top 30 Sentiment Words",
    subtitle = "Grouped by BING Sentiment Classification"
  ) +
  theme(plot.title = element_text(face = "bold"))

As opposed to the top 15 of each sentiment, stacking the top 30 overall words gives us an idea of the proportion of positive vs. negative for the dataset. Of the top 30, 11 of the 30 are negative, including the mis-classified numb.

Sentiment over Time (AFINN)

plot_df2 <- afinn_df %>%
  filter(Time > "2020-01-10") %>%
  mutate(mon = floor_date(Time, "day")) %>%
  group_by(mon) %>%
  summarize(value = mean(value), .groups = 'keep')

plot_df2$color <- ifelse(plot_df2$value < 0, "negative","positive")

ggplot(plot_df2, aes(mon, value, fill = color)) +
  geom_col(show.legend = FALSE) +
  theme_minimal() +
  scale_fill_manual(values=c("#cc0000", "#cccccc")) +
  labs(x = NULL, y = "Sentiment",
    title = "Sentiment by Week",
    subtitle = "Calculated by Mean AFINN sentiment score "
  ) +
  theme(plot.title = element_text(face = "bold"))

When we look at the data’s sentiment over time, the forums’ sentiment is positive as measured by the mean sentiment score by day. The absence of negative more negative-biased days doesn’t mean that there isn’t any negative feedback, just that overall, it is positive on average.

Topic Identification

Now that we know the sentiment, we can start identifying the topics most frequently discussed on the forum. For this exercise, we’ll employ two different methods. The first is a model called n-Grams, which looks and reoccurring sequences of words6. We will use bi-grams, which looks at two-word combinations and their frequency. The second model is known as Topic Modeling. Topic modeling is a method for unsupervised classification which finds natural groups of items, regardless if you know anything about the text7.

Bi-Grams

bigram_counts <- df_select %>%
  unnest_tokens(bigram, Discussion, token = "ngrams", n = 2) %>% 
  count(bigram, sort = TRUE) %>%
  separate(bigram, c("word1", "word2"), sep = " ") %>% 
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word) %>% 
  drop_na()

head(bigram_counts, n=20)

The top two highest reoccurring bigrams are model 3 and service center. These two are clearly great candidates for the Model 3 discussion forum. However; after the top two we can start to identify potential candidates for highly discussed product features.

We can visualize these association as well using a graph.

bigram_graph <- bigram_counts %>%
  filter(n > 240) %>%
  graph_from_data_frame()

set.seed(2017)
a <- grid::arrow(type = "open", length = unit(.05, "inches"))

ggraph(bigram_graph, layout = "nicely") +
  geom_edge_link(arrow = a, end_cap = circle(.02, 'inches'), color="#cccccc") +
  geom_node_point(color = "#cc0000", size = 2) +
  geom_node_text(aes(label = name), vjust = -.4, hjust = -.1) +
  theme_minimal()

Some of the top bi-grams that result from the dataset that is more product-related are as follows:

  • Sentry - Mode: This is a unique feature of the car which records activity outside the vehicle via the cameras it uses for self-driving.
  • Speed - Limit: Most likely related to the possible limits when using the self-driving feature.
  • Tesla - App: The mobile app that is supported on Apple and Android devices.
  • Phone - Key: The mobile app is used to unlock the car.
  • Software - Update: Tesla’s go through regular software updates every 1-2 weeks.
  • Take - Delivery: Related to the purchase process.
  • Mile - Range: Being a battery powered car, the range is highly discussed.
  • Battery - Degradation: Similar to the range, do batteries retain their health.
  • Wall - Connector: The product name for the home charger.
  • Service - Center: Related to the location where service is performed.
  • Voice - Command: One of the most complained about features according to the Bloomberg report8.

Topic Modeling

After inspecting the most common word pairs used in the discussion forums, specifically in the longer text replies, next, we’ll take a look at trying to identify topics of discussion via Topic Modeling on the “subjects” of each of the topics. This method is an unsupervised method that automatically attempts to identify related topics based on the corpus of text.

corpus <- Corpus(VectorSource(df_topics$Topic))
corpus <- tm_map(corpus, tolower)
corpus <- tm_map(corpus, removePunctuation)  # remove punctuation
corpus <- tm_map(corpus, stripWhitespace)    # remove white space
corpus <- tm_map(corpus, removeWords, c(stopwords('english')))
corpus <- tm_map(corpus, lemmatize_strings) # lemmatizaton
# Manually remove odd characters that frequently appear
corpus <- tm_map(corpus,content_transformer(function(x) gsub("“", " ", x)))
corpus <- tm_map(corpus, content_transformer(function(x) gsub("”", " ", x)))
corpus <- tm_map(corpus, content_transformer(function(x) gsub("’", " ", x)))
corpus <- tm_map(corpus, removeWords, c("tesla", "model", "anyone", 
                                        "car", "get", "work", "use",
                                        "come", "question", "can",
                                        "issue", "now"))
dtm <- DocumentTermMatrix(corpus)
dtm = removeSparseTerms(dtm, .995)
sel_idx <- rowSums(as.matrix(dtm)) > 0
dtm <- dtm[sel_idx, ]

LDA

Latent Dirichlet Allocation (LDA), is a generative probabilistic model for collections of text that allows sets of observations to be explained by unobserved groups9. We can apply this method to the discussion topics and let the machine try to come up with the most probable topics grouped together.

Determining the Number of Topics

Since we don’t yet know how many topics are being discussed in the forums, we need to estimate how many we should attempt to create. One of the most reliable and standard methods would be to employ a subject matter expert to approximate what those might be. Alternatively, there are other methods, such as calculating Perplexity, that can assist. Perplexity is a measurement of how well a probability distribution or probability model predicts a sample. By plotting Perplexity, we can visualize an approximate number of topics where the model stops improving10.

mod_perplexity = numeric(0)
topics <- c(2:15)  

for (i in topics){
  mod <- LDA(dtm, k = i, method = "Gibbs", 
             control = list(seed=1234) )
  mod_perplexity[i] = perplexity(mod, dtm)
}
mod_perplexity <- mod_perplexity[!is.na (mod_perplexity)]
plot(x=topics, y=mod_perplexity, type = "b", xlab = "Number of Topics", ylab = "Perplexity")

Upon plotting, we can see the number of topics stops improving at 10 and again at 13 using Gibbs sampling.

Top-Level Topics

Next we can create a visualization of the words the belong to these 10 LDA topics, and can inspect them to see if they make sense.

lda <- LDA(dtm, k = 10, method="Gibbs", control = list(seed = 1234))
topics <- tidy(lda, matrix = "beta")
top_terms <- topics %>%
  group_by(topic) %>%
  top_n(7, beta) %>%
  ungroup() %>%
  arrange(topic, -beta)


top_terms %>%
  mutate(term = reorder_within(term, beta, topic)) %>%
  ggplot(aes(beta, term, fill = factor(topic))) +
    geom_col(show.legend = FALSE) +
    facet_wrap(~ topic, scales = "free", ncol=2) +
    theme_minimal(base_size = 16) + 
    scale_fill_manual(values=c("#cc0000", "#333333", "#666666", "#999999", "#cccccc", 
                               "#cc0000", "#333333", "#666666", "#999999", "#cccccc")) +
    scale_y_reordered()

Based on the output, here are some of the words that stand out, potentially forming topics.

  1. Range, battery, dashcam: Battery and range related discussion.
  2. Battery, Trip, and Range: Issues related to long-distance driving.
  3. Tire upgrade for performance: could be related to the performance model, and it’s tire selection.
  4. Door, Service, Paint, Trunk: This sounds like it’s related to some of the more significant issues owners have experienced when picking up their cars.
  5. iPhone, update, lock: Potentially related to the new features released to have your app automatically lock doors when left unattended.
  6. Sentry Mode: This is a unique feature of the car which records activity outside the vehicle via the cameras it uses for self-driving.
  7. Full Self Driving, App, HW3: Potentially related to purchasing the FSD update via the mobile app. HW3 refers to which generation of full self-driving hardware your car has. Older models can’t support all features.
  8. Drive, Brake, Autopilot, Park (Upgrading to Full Self Driving potentially, which can be purchased via the mobile application.)
  9. Phone, USB, Back: Appears to interior phone and connectivity options. Such as USB in the back seats.
  10. Charge, Autopilot, Wheel, Noise: Autopilot related issues. The much stronger first value (charge) versus the other words in this topic.

With the LDA version versus the bigram modeling, it seems a little clearer when using the bigram approach, but we do see some other topics show up like Full Self Driving, and Software Updates / Upgrades, paint, and trunk issues. These both can be taken into account when examining product feature related issues.

Product Feature Analysis

After analyzing the most common customer topics, we’re going to look at sentiment around four of them. Out of the identified initial topics. We will use a specific method here referred to as a targeted dictionary analysis11, and we will look at the following:

  1. Full Self Driving: FSD, a much-anticipated feature rolled out in private beta in 2020. There has been quite a bit of coverage of this from early adopters on YouTube.
  2. Software - Update: Tesla’s go through regular software updates every 1-2 weeks.
  3. Mile - Range & Battery - Degradation: The Model 3 is a battery-powered car; the range is highly discussed.
  4. Voice - Command: Related to the feature to give the car voice commands.
my_corpus <- corpus(df_select$Discussion)  # build a new corpus from the texts

quant_dfm <- dfm(my_corpus, )
quant_dfm <- dfm_trim(quant_dfm, min_termfreq = 4, max_docfreq = 10)

# Reduce the columns to just what's needed
quant_tesla <- select(df_select, doc_id, Discussion, User, Time)

# Quanteda requires the text field to be called "text"
quant_tesla <- quant_tesla %>%
  rename(text = Discussion)

# Create the Corpus
corp_tesla <- corpus(quant_tesla)

# Add columns for Year, Month, and Week Number
corp_tesla$year <- year(corp_tesla$Time)
corp_tesla$month <- month(corp_tesla$Time)
corp_tesla$week <- week(corp_tesla$Time)

# Subset the Corpus for Just 2020
corp_tesla <- corpus_subset(corp_tesla, "year" >= 2020)
toks_tesla <- quanteda::tokens(corp_tesla, remove_punct = TRUE)

Full Self Driving

Full Self Driving is a new feature in Beta for Teslas that will autonomously drive your car without an intervention from start to stop on a destination programmed into the navigation system. This feature had been promised for years and finally started rolling out in October to a limited audience. It’s much anticipated and highly coveted. It should be received positively.

# get relevant keywords and phrases
fsd <- c("fsd", "self driving", "autopilot")

# only keep tokens specified above and their context of ±10 tokens
toks_fsd <- tokens_keep(toks_tesla, pattern = phrase(fsd), window = 10)

toks_fsd <- tokens_lookup(toks_fsd, dictionary = data_dictionary_LSD2015[1:2])

# create a document document-feature matrix and group it by weeks in 2016
dfmat_fsd_lsd <- dfm(toks_fsd) %>% 
    dfm_group(group = "week", fill = TRUE) 

matplot(dfmat_fsd_lsd, type = "l", xaxt = "n", lty = 1, ylab = "Frequency", col = c("#cc0000", "#cccccc"), 
        main = "Sentiment of Full Self Driving for 2020")
grid()
axis(1, seq_len(ndoc(dfmat_fsd_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_fsd_lsd)) - 1))
legend("topleft", col = c("#cc0000", "#cccccc"), legend = c("Negative", "Positive"), lty = 1, bg = "white")

There is a clear difference in the proportion of positive sentiment to negative over time. There are five or so spikes in peak positive sentiment frequency; these are possibly mapped to the news, such as the original announcement of the FSD Beta availability on October 12th.

Tesla will release ‘Full Self-Driving’ beta next week, Musk says on CNET.com

n_fsd <- ntoken(dfm(toks_fsd, group = toks_fsd$week))
plot((dfmat_fsd_lsd[,2] - dfmat_fsd_lsd[,1]) / n_fsd, 
     type = "l", ylab = "Sentiment", xlab = "", xaxt = "n", col = c("#cccccc"),
     main = "Mean Sentiment of Full Self Driving for 2020")
axis(1, seq_len(ndoc(dfmat_fsd_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_fsd_lsd)) - 1))
grid()
abline(h = 0, lty = 2)

Plotting the combined Mean Sentiment, we see that other than a couple of dips into negative; it’s been positive overall.

Battery and Range

Next is the topic of Range, Battery, Charging, and Degradation. Key words that were frequently in the bigrams as well in the topic modeling. There is a phenomenon called Range Anxiety, defined as “Worry on the part of a person driving an electric car that the battery will run out of power before the destination or a suitable charging point is reached.12” Tesla has an industry-leading range in their cars, yet still, it’s such a new concept for people that it’s highly discussed. We can examine the sentiment around these topics next.

# get relevant keywords and phrases
bat <- c("battery", "charge", "range", "degradation")

# only keep tokens specified above and their context of ±10 tokens
toks_bat <- tokens_keep(toks_tesla, pattern = phrase(bat), window = 10)

toks_bat <- tokens_lookup(toks_bat, dictionary = data_dictionary_LSD2015[1:2])

# create a document document-feature matrix and group it by weeks in 2016
dfmat_bat_lsd <- dfm(toks_bat) %>% 
    dfm_group(group = "week", fill = TRUE) 

matplot(dfmat_bat_lsd, type = "l", xaxt = "n", lty = 1, ylab = "Frequency", col = c("#cc0000", "#cccccc"),
        main = "Sentiment of Battery/Charging/Range for 2020")
grid()
axis(1, seq_len(ndoc(dfmat_bat_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_bat_lsd)) - 1))
legend("topleft", col = c("#cc0000", "#cccccc"), legend = c("Negative", "Positive"), lty = 1, bg = "white")

This time we see the positive sentiment is lower than the negative on average for the week’s discussions. There is also a single event that stands out around the summer time frame. Around this time, Tesla’s CEO, Elon Musk, announced a formal day for their anticipated “Battery Day13,” potentially the spike in negative news. Further investigation into the posts around that time could reveal the actual negative topics.

n_bat <- ntoken(dfm(toks_bat, group = toks_bat$week))
plot((dfmat_bat_lsd[,2] - dfmat_bat_lsd[,1]) / n_bat, 
     type = "l", ylab = "Sentiment", xlab = "", xaxt = "n", col = c("#cc0000"),
     main = "Sentiment of Battery/Charging/Range for 2020")
axis(1, seq_len(ndoc(dfmat_bat_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_bat_lsd)) - 1))
grid()
abline(h = 0, lty = 2)

When looking at the combined mean weekly sentiment, the entire year for this topic is negative. This is an area worth further investigation.

Software Updates

One of the marquis features of a Tesla is its ability to improve itself via software updates continuously. The updates come about every two weeks and bring new capabilities such as enhanced self-driving, new games, or even additional performance and range. This capability is a one of a kind capability that makes a Tesla feel more like a smart phone than a car.

# get relevant keywords and phrases
sw <- c("software", "update")

# only keep tokens specified above and their context of ±10 tokens
toks_sw <- tokens_keep(toks_tesla, pattern = phrase(sw), window = 10)

toks_sw <- tokens_lookup(toks_sw, dictionary = data_dictionary_LSD2015[1:2])

# create a document document-feature matrix and group it by weeks in 2016
dfmat_sw_lsd <- dfm(toks_sw) %>% 
    dfm_group(group = "week", fill = TRUE) 

matplot(dfmat_sw_lsd, type = "l", xaxt = "n", lty = 1, ylab = "Frequency", col = c("#cc0000", "#cccccc"),
        main = "Sentiment of Software Updates for 2020")
grid()
axis(1, seq_len(ndoc(dfmat_sw_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_sw_lsd)) - 1))
legend("topleft", col = c("#cc0000", "#cccccc"), legend = c("Negative", "Positive"), lty = 1, bg = "white")

Overall, the sentiment is more positive than negative. That final spike in October coincides with the spike from the Full Self Driving Beta announcement noted above. Since that’s delivered via a software update, most likely, these topics overlap.

n_sw <- ntoken(dfm(toks_sw, group = toks_sw$week))
plot((dfmat_sw_lsd[,2] - dfmat_sw_lsd[,1]) / n_sw, 
     type = "l", ylab = "Sentiment", xlab = "", xaxt = "n", col = c("#999999"),
     main = "Sentiment of Software Updates for 2020")
axis(1, seq_len(ndoc(dfmat_sw_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_sw_lsd)) - 1))
grid()
abline(h = 0, lty = 2)

Mostly positive over the year, with an exception in September. This should be investigated to understand further why sentiment was negative at this time.

Voice Command

According to the survey conducted by Bloomberg of 5,000 Tesla Model 3 owners, the two areas of the car with the least positive feedback were its Voice Command Capabilities and the lackluster Automatic Wipers14. The survey was taken in mid-2019; we can look at the data from 2020’s discussions to see how sentiment is trending.

# get relevant keywords and phrases
voice <- c("voice", "command")

# only keep tokens specified above and their context of ±10 tokens
toks_voice <- tokens_keep(toks_tesla, pattern = phrase(voice), window = 10)

toks_voice <- tokens_lookup(toks_voice, dictionary = data_dictionary_LSD2015[1:2])

# create a document document-feature matrix and group it by weeks in 2016
dfmat_voice_lsd <- dfm(toks_voice) %>% 
    dfm_group(group = "week", fill = TRUE) 

matplot(dfmat_voice_lsd, type = "l", xaxt = "n", lty = 1, ylab = "Frequency", col = c("#cc0000", "#cccccc"),
        main = "Sentiment of Voice Command for 2020")
grid()
axis(1, seq_len(ndoc(dfmat_voice_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_voice_lsd)) - 1))
legend("topleft", col = c("#cc0000", "#cccccc"), legend = c("Negative", "Positive"), lty = 1, bg = "white")

Overal, it appears tha the data is more positive than negative; a potential clue that the system is improving in the minds of users.

n_voice <- ntoken(dfm(toks_voice, group = toks_voice$week))
plot((dfmat_voice_lsd[,2] - dfmat_voice_lsd[,1]) / n_voice, 
     type = "l", ylab = "Sentiment", xlab = "", xaxt = "n", col = c("#999999"),
     main = "Sentiment of Voice for 2020")
axis(1, seq_len(ndoc(dfmat_voice_lsd)), ymd("2020-01-01") + weeks(seq_len(ndoc(dfmat_voice_lsd)) - 1))
grid()
abline(h = 0, lty = 2)

Over the 2020 year, it appears that it’s trending slight more postive over time as well. Further investigation needed.

Conclusion

It’s no secret that Tesla’s success with the Model 3 is echoed in the owner’s sentiment. With the overall sentiment, we saw that topics were overwhelmingly positive, and customers appear to be very happy with their cars. When exploring product features to investigate, the combination of bigrams and topic modeling with LDA uncovered many of the hottest items in the forums.

First, Full Self Driving or FSD had a momentous year with the release of numerous updates and the long-anticipated FSD Beta released to select owners. The beta was met with very positive feedback and continued anticipation.

Next, Batter Life, Range, and Charging were explored. No question, this was found to be the area with the most negative sentiment. While Tesla leads the industry in price/performance/range for Electric cars, it is still a widely discussed topic with negative sentiment.

Software Updates are a much-loved feature that brings new capabilities to an owner’s car regularly. Not only are these much talked about they’re also very much loved by users. A critical differentiation that can continue to be leveraged for innovation and differentiation.

Finally, the Voice Command feature of the car, which had previously received negative feedback, appears to have started to make upwards trends of customer sentiment trends. This method of identifying customer sentiment is only a cursory pass at understanding real feedback. Further investigation should be considered.

I recommend that these are four of the top discussed features with a mix of positive and negative sentiment. Further analysis should be done to identify what caused the change in sentiment, or the spikes, and a more in-depth look through the existing threads that drive these discussed to look beyond machine-based analysis and employ ethnographic research practices to synthesize the data. While even tho the machine is a simplistic view of sentiment (lexicon-based), the ability to narrow down topics, patterns, and trends across tens of thousands of discussions is a powerful tool in a product organization’s toolbelt.

References

Tesla Color Pallet